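Neural Radiance Field (NeRF) is a popular method in data-driven 3D reconstruction. Given its simplicity and high-quality rendering, many NeRF applications are being developed. However, NeRF's major limitation is its slow speed. Many attempts have been made to speed up NeRF training and inference, including intricate code-level optimization and caching, the use of sophisticated data structures, and amortization through multi-task and meta learning. In this work, we revisit the basic building blocks of NeRF through the lens of classic techniques that predate NeRF. We propose Voxel-Accelerated NeRF (VaxNeRF), which integrates NeRF with visual hull, a classic 3D reconstruction technique that requires only binary foreground-background pixel labels per image. The visual hull, which can be optimized in about 10 seconds, provides a coarse in-out field separation that allows a substantial number of network evaluations in NeRF to be omitted. We provide a clean, fully Pythonic, JAX-based implementation on the popular JaxNeRF codebase, consisting of only about 30 lines of code changes and a modular visual hull subroutine, and achieve about 2-8x faster learning on top of the highly performant JaxNeRF baseline with zero degradation in rendering quality. With sufficient compute, this effectively reduces NeRF training time from hours to 30 minutes. We hope that VaxNeRF, a careful combination of a classic technique with a deep method (one that arguably replaced it), can empower and accelerate new NeRF extensions and applications with its simplicity, portability, and reliable performance gains. Code is available at https://github.com/naruya/vaxnerf.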
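As a rough illustration of the visual hull idea described above, the following sketch keeps only those candidate 3D points whose projections fall on foreground pixels in every view. It is not the VaxNeRF implementation: the function name `carve_visual_hull`, the 3x4 projection-matrix convention, and the choice to carve points that project outside an image are all assumptions made for this example.

```python
import numpy as np

def carve_visual_hull(masks, projections, grid_points):
    """Return a boolean array marking points that lie inside the visual hull.

    masks:       list of (H, W) boolean arrays, True = foreground
    projections: list of (3, 4) matrices mapping homogeneous world points
                 to homogeneous pixel coordinates (x, y, w)  [assumed convention]
    grid_points: (N, 3) array of candidate voxel centres
    """
    n = grid_points.shape[0]
    homogeneous = np.concatenate([grid_points, np.ones((n, 1))], axis=1)
    inside = np.ones(n, dtype=bool)
    for mask, P in zip(masks, projections):
        h, w = mask.shape
        pix = homogeneous @ P.T                       # (N, 3)
        depth = pix[:, 2]
        in_front = depth > 1e-9                       # ignore points behind the camera
        safe_depth = np.where(in_front, depth, 1.0)   # avoid division by zero
        x = np.floor(pix[:, 0] / safe_depth).astype(int)
        y = np.floor(pix[:, 1] / safe_depth).astype(int)
        in_image = in_front & (x >= 0) & (x < w) & (y >= 0) & (y < h)
        hit = np.zeros(n, dtype=bool)
        hit[in_image] = mask[y[in_image], x[in_image]]
        # Assumption: a point that leaves the image in any view is carved away.
        inside &= hit
    return inside
```

In a NeRF-style pipeline, a mask like the returned one could be used to skip network evaluations at sample points outside the hull; how VaxNeRF wires this in is not specified here.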
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice and the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.
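Although communication delays can disrupt multiagent systems, most existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume a perfect communication environment, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that can handle communication delays among agents. By broadcasting both the newly optimized trajectory and the committed trajectory, and by performing a delay check step, RMADER is able to guarantee safety even under communication delay. RMADER was validated through extensive simulation and hardware flight experiments and achieved a 100% success rate of collision-free trajectory generation, outperforming state-of-the-art approaches.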
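This paper proposes a novel polarization sensor structure and network architecture to obtain a high-quality RGB image and polarization information. Conventional polarization sensors can acquire RGB images and polarization information simultaneously, but the polarizers on the sensor degrade the quality of the RGB images. There is a trade-off between RGB image quality and polarization information: fewer polarization pixels reduce the degradation of the RGB image but also lower the resolution of the polarization information. We therefore propose an approach that resolves this trade-off by arranging polarization pixels sparsely on the sensor and compensating the low-resolution polarization information to a higher resolution using the RGB image as a guide. Our proposed network architecture consists of an RGB image refinement network and a polarization information compensation network. We confirmed the superiority of our proposed network in compensating the differential component of polarization intensity by comparing its performance with state-of-the-art methods for a similar task, depth completion. Furthermore, we confirmed that our approach can simultaneously acquire higher-quality RGB images and polarization information than conventional polarization sensors, resolving the trade-off between RGB image quality and polarization information. The baseline code and newly generated real and synthetic large-scale polarization image datasets are available for further research and development.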
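Unsupervised domain adaptation (UDA) is one of the key technologies for problems in which the ground-truth labels needed for supervised learning are hard to obtain. In general, UDA assumes that all samples from both the source and target domains are available during training. However, this is not a realistic assumption in applications where data privacy is a concern. To overcome this limitation, UDA without source data, referred to as source-free unsupervised domain adaptation (SFUDA), has recently been proposed. Here, we propose an SFUDA method for medical image segmentation. In addition to the entropy minimization method commonly used in UDA, we introduce a loss function that prevents feature norms in the target domain from becoming small, and a prior that preserves the shape constraints of the target organ. We conduct experiments on datasets that include multiple types of source-target domain combinations to show the versatility and robustness of our method. We confirm that our method outperforms the state of the art on all datasets.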
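The entropy minimization term mentioned above can be illustrated with a minimal sketch: the mean per-pixel Shannon entropy of a segmentation network's softmax output, which such methods minimize on unlabeled target images. The function name `mean_prediction_entropy` and the `(N, C, H, W)` layout are assumptions for this example; the paper's feature-norm loss and shape prior are not reproduced here.

```python
import numpy as np

def mean_prediction_entropy(probs, eps=1e-8):
    """Mean per-pixel Shannon entropy of softmax outputs.

    probs: (N, C, H, W) array whose values sum to 1 along the class axis C
           (assumed layout). Lower values mean more confident predictions.
    """
    probs = np.clip(probs, eps, 1.0)                  # avoid log(0)
    entropy = -(probs * np.log(probs)).sum(axis=1)    # (N, H, W)
    return float(entropy.mean())
```

A uniform two-class prediction gives an entropy of ln 2 per pixel, while a one-hot prediction gives a value close to zero, so minimizing this quantity pushes the model toward confident target-domain predictions.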
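Multi-stain whole-slide image (WSI) registration is an active field of research. However, it is unclear how current WSI registration methods would perform on real-world datasets. The ACROBAT challenge verifies the performance of current WSI registration methods using a new dataset that originates from routine diagnostics, in order to assess real-world applicability. In this report, we present our solution for the ACROBAT challenge. We employ a two-step approach that includes rigid and non-rigid transforms. The experimental results show a median of 1,250 µm on the validation dataset.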
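To facilitate the development of medical image segmentation techniques, AMOS, a large-scale abdominal multi-organ dataset for versatile medical image segmentation, has been provided, and the AMOS 2022 challenge was held using this dataset. In this report, we present our solution for the AMOS 2022 challenge. We employ a residual U-Net with deep supervision as our base model. The experimental results show that the mean scores of the Dice similarity coefficient and the normalized surface Dice are 0.8504 and 0.8476 for the CT-only task and the CT/MRI task, respectively.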
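For readers unfamiliar with the Dice similarity coefficient reported above, here is a minimal sketch of the standard definition for a single binary mask, 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions for this example; the challenge's official evaluation code may differ, and the normalized surface Dice is not shown.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred, target).sum()
    return float(2.0 * overlap / denom)
```

For multi-organ segmentation, a score like this would typically be computed per organ label and then averaged.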
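We propose a method for representing the digest information of each dataset, intended to aid the innovative thinking and communication of data users who attempt to create valuable products, services, and business models by using or combining datasets. Compared with methods that connect datasets via shared attributes (i.e., variables), this method connects datasets via events, situations, or actions in a scenario that is supposed to be active in the real world. The method reflects a consideration of how well each piece of metadata fits the feature concept, which is a summary of the information or knowledge expected to be acquired from the data. As a result, data users obtain practical knowledge that fits the requirements of real businesses and real life, as well as a basis for applying AI technologies to the data.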
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combination delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms by competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them, in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition which is of utmost importance for the development of AI in surgery.